Automatic Parallelization of Non-uniform Dependences
Abstract
This report summarizes our experience with automatic program parallelization tools for converting sequential Fortran code to run on a multiprocessor computer. We evaluated several such tools, including Parafrase, Adaptor, PAT, Petit, and the SUIF compiler package, for their suitability in parallelizing Computational Fluid Dynamics (CFD) code supplied by the Army Research Laboratory, Aberdeen Proving Grounds. Extensive tests on a suite of test programs and a matrix multiplication program showed SUIF to be the most suitable. Based on these experiments, we suggest some modifications to the existing SUIF toolkit for efficient parallelization of the CFD code. Although SUIF partitions loops with uniform dependences efficiently, it cannot efficiently handle nested loops with irregular dependences. Unlike nested loops with uniform dependences, such loops have a complicated dependence pattern that forms a non-uniform dependence vector set. We propose to incorporate additional passes into SUIF, based on results from our previous research, to generate code that handles applications with non-uniform dependences.
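The distinction between uniform and non-uniform dependences can be illustrated with a small brute-force sketch. The two loop bodies below are hypothetical examples chosen for illustration, not taken from the report or from the CFD code, and the helper `distance_vectors` is likewise an assumed name. The sketch enumerates every pair of iterations in a small two-level loop nest that touch the same array element and collects the resulting distance vectors.

```python
from itertools import product


def distance_vectors(n, write_idx, read_idx):
    """Enumerate dependence distance vectors of an n x n loop nest by brute force.

    write_idx(i, j) and read_idx(i, j) give the array element written and
    read by iteration (i, j). Distances are normalized to be
    lexicographically positive, so flow and anti dependences are not
    distinguished here.
    """
    iterations = list(product(range(1, n + 1), repeat=2))

    # Map each array element to the iterations that write it.
    writers = {}
    for it in iterations:
        writers.setdefault(write_idx(*it), []).append(it)

    distances = set()
    for sink in iterations:
        for src in writers.get(read_idx(*sink), []):
            if src == sink:
                continue  # same iteration: not a loop-carried dependence
            d = (sink[0] - src[0], sink[1] - src[1])
            if d < (0, 0):  # flip so the earlier iteration is the source
                d = (-d[0], -d[1])
            distances.add(d)
    return distances


# Hypothetical uniform case: A(i, j) = A(i-1, j-1) + ...
uniform = distance_vectors(
    4,
    write_idx=lambda i, j: (i, j),
    read_idx=lambda i, j: (i - 1, j - 1),
)

# Hypothetical non-uniform case: A(2*i + j) = A(i + 2*j) + ...
non_uniform = distance_vectors(
    4,
    write_idx=lambda i, j: 2 * i + j,
    read_idx=lambda i, j: i + 2 * j,
)

print("uniform:", sorted(uniform))
print("non-uniform:", sorted(non_uniform))
```

In the first case every dependence has the same distance vector, so a single partitioning direction works for the whole iteration space. In the second case the distance depends on the iteration, which is the irregular pattern the abstract describes as a non-uniform dependence vector set.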
Similar resources
Unique Sets Oriented Parallelization of Loops with Non-Uniform Dependences
Although many methods exist for nested loop partitioning, most of them perform poorly when parallelizing loops with non-uniform dependences. This paper addresses the issue of automatic parallelization of loops with non-uniform dependences. Such loops are normally not parallelized by existing parallelizing compilers and transformations. Even when parallelized in rare instances, the performance i...
Limits of dependence analysis for automatic parallelization
Automatic parallelization is an increasingly important technique for accelerating sequential applications on multicore processors. This approach relies on having an accurate compile-time dependence analysis to identify independent sections of code. Previously it has been assumed that improving this analysis would also improve the performance of parallelized code. In this paper we use novel prof...
Computer Science Technical Report Canonic Multi-Projection: Memory Allocation for Distributed Memory Parallelization
The Polyhedral model is now the accepted technology for automatic parallelization of affine control loop programs. It has been successful in automatically generating tiled shared memory parallel programs for shared memory platforms (plus vectorization). We address the challenges arising when we move toward distributed memory parallelization, based on wavefront execution of parameterized tiles. ...
Generation of Synchronous Code for Automatic Parallelization of while Loops
Automatic parallelization of imperative programs has focused on nests of do loops with affine bounds and affine dependences, because in this case execution domains and dependences are precisely known at compile-time. Parallelization can then be done using a suitable space-time transformation, yielding a logically synchronous program. Code generation consists of scanning the transformed execution d...
Parallelization Techniques with Improved Dependence Handling
Continuing exponential growth in transistor density and diminishing returns from the increasing transistor count have forced processor manufacturers to pack multiple processor cores onto a single chip. These processors, known as multi-core processors, generally do not improve the performance of single-threaded applications. Automatic parallelization has a key role to play in improving the perfo...
Publication date: 1996